
    Dynamic Parameter Allocation in Parameter Servers

    To keep up with increasing dataset sizes and model complexity, distributed training has become a necessity for large machine learning tasks. Parameter servers ease the implementation of distributed parameter management, a key concern in distributed training, but can induce severe communication overhead. To reduce this overhead, distributed machine learning algorithms use techniques that increase parameter access locality (PAL), achieving up to linear speed-ups. We found, however, that existing parameter servers provide only limited support for PAL techniques and therefore prevent efficient training. In this paper, we explore whether and to what extent PAL techniques can be supported, and whether such support is beneficial. We propose to integrate dynamic parameter allocation into parameter servers, describe an efficient implementation of such a parameter server called Lapse, and experimentally compare its performance to existing parameter servers across a number of machine learning tasks. We found that Lapse provides near-linear scaling and can be orders of magnitude faster than existing parameter servers.
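
    A minimal single-process sketch of the dynamic parameter allocation idea is given below. It models parameter relocation with a hypothetical localize primitive; all names (ParameterServer, pull, push, localize) are illustrative stand-ins for the concept, not the actual Lapse API.

        # Single-process sketch of dynamic parameter allocation: ownership of a
        # parameter can move between nodes at runtime so that subsequent
        # accesses become local (the PAL idea described in the abstract above).

        class ParameterServer:
            def __init__(self, num_nodes):
                self.stores = [dict() for _ in range(num_nodes)]  # one shard per node
                self.owner = {}  # key -> node currently holding the parameter

            def _node_of(self, key, default_node):
                return self.owner.setdefault(key, default_node)

            def pull(self, node, key):
                # a remote read in a real system; a local read if `node` owns the key
                return self.stores[self._node_of(key, node)].get(key, 0.0)

            def push(self, node, key, delta):
                store = self.stores[self._node_of(key, node)]
                store[key] = store.get(key, 0.0) + delta

            def localize(self, node, keys):
                # dynamic parameter allocation: relocate ownership so that the
                # caller's upcoming pulls and pushes hit local memory
                for key in keys:
                    old = self._node_of(key, node)
                    if old != node:
                        self.stores[node][key] = self.stores[old].pop(key, 0.0)
                        self.owner[key] = node

        ps = ParameterServer(num_nodes=2)
        ps.push(0, "w_42", 1.5)        # parameter initially allocated on node 0
        ps.localize(1, ["w_42"])       # node 1 announces its upcoming accesses
        print(ps.pull(1, "w_42"))      # 1.5, now served from node 1's local store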

    The DESQ framework for declarative and scalable frequent sequence mining

    DESQ is a general-purpose framework for declarative and scalable frequent sequence mining. Applications express their specific sequence mining tasks using a simple yet powerful pattern expression language, and DESQ's computation engine automatically executes the mining task in an efficient and scalable way. In this paper, we give a brief overview of DESQ and its components.
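
    As a flavor of what a declarative subsequence constraint buys, the toy Python sketch below counts sequences that contain one fixed pattern of the form "item a, then any gap, then item b". DESQ's actual pattern expression language is far more expressive (regex-like and hierarchy-aware); this stand-in, including the contains helper and the example database, is purely illustrative.

        # Count the support of the constraint "a followed (with arbitrary gaps)
        # by b" over a tiny sequence database; report whether it is frequent.

        def contains(seq, first, last):
            # does `seq` contain `first` followed, possibly with gaps, by `last`?
            if first not in seq:
                return False
            return last in seq[seq.index(first) + 1:]

        db = [["a", "x", "b"], ["a", "b"], ["c", "a", "y", "b"], ["b", "a"]]
        support = sum(contains(seq, "a", "b") for seq in db)
        min_support = 3
        print(("a", "...", "b"), "is frequent:", support >= min_support)  # True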

    Scalable frequent sequence mining with flexible subsequence constraints

    We study scalable algorithms for frequent sequence mining under flexible subsequence constraints. Such constraints enable applications to specify concisely which patterns are of interest and which are not. We focus on the bulk synchronous parallel model with one round of communication; this model is suitable for platforms such as MapReduce or Spark. We derive a general framework for frequent sequence mining under this model and propose the D-SEQ and D-CAND algorithms within this framework. The algorithms differ in what data are communicated and how computation is split up among workers. To the best of our knowledge, D-SEQ and D-CAND are the first scalable algorithms for frequent sequence mining with flexible constraints. We conducted an experimental study on multiple real-world datasets; it suggests that our algorithms scale nearly linearly, outperform common baselines, and incur acceptable overhead for their generality compared to existing, less general mining algorithms.
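
    The one-round bulk synchronous shape shared by both algorithms can be simulated in a few lines of Python, shown below. Each "worker" mines candidates from its partition, a single shuffle aggregates the local counts, and a global support filter keeps the frequent patterns. The candidate generation here (all length-2 subsequences) is a placeholder for the papers' constraint-driven mining, and all names are hypothetical.

        # One communication round: local mining per partition, then one global
        # aggregation, then a minimum-support filter (MapReduce/Spark style).

        from collections import Counter

        def local_phase(partition):
            # each worker emits (pattern, local_count) pairs; length-2
            # subsequences stand in for real constraint-based candidates
            out = Counter()
            for seq in partition:
                out.update({(seq[i], seq[j])
                            for i in range(len(seq))
                            for j in range(i + 1, len(seq))})
            return out

        partitions = [[["a", "b", "c"], ["a", "c"]], [["a", "c"], ["b", "c"]]]
        shuffled = Counter()
        for part in partitions:                  # map phase on each worker
            shuffled.update(local_phase(part))   # the single shuffle/reduce
        min_support = 3
        print({p: n for p, n in shuffled.items() if n >= min_support})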

    Good Intentions: Adaptive Parameter Management via Intent Signaling

    Parameter management is essential for distributed training of large machine learning (ML) tasks. Some ML tasks are hard to distribute because common approaches to parameter management can be highly inefficient. Advanced parameter management approaches, such as selective replication or dynamic parameter allocation, can improve efficiency, but to do so they typically need to be integrated manually into each task's implementation, and they require expensive upfront experimentation to tune correctly. In this work, we explore whether these two problems can be avoided. We first propose a novel intent signaling mechanism that integrates naturally into existing ML stacks and provides the parameter manager with crucial information about parameter accesses. We then describe AdaPM, a fully adaptive, zero-tuning parameter manager based on this mechanism. In contrast to prior systems, this approach separates providing information (simple, done by the task) from exploiting it effectively (hard, done automatically by AdaPM). In our experimental evaluation, AdaPM matched or outperformed state-of-the-art parameter managers out of the box, suggesting that automatic parameter management is possible.
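
    A minimal sketch of the intent signaling idea follows: the task only declares which parameters it will access in which logical-clock window, and the parameter manager decides how to exploit that, here by prefetching into a worker-local cache. The names (IntentAwarePM, intent, advance_clock, pull) and the prefetching policy are assumptions for illustration, not AdaPM's actual interface.

        # The task side is simple: announce future accesses via intent().
        # The manager side is where the hard decisions live; here it merely
        # prefetches parameters whose intent window is currently open.

        class IntentAwarePM:
            def __init__(self, store):
                self.store = store      # stand-in for the remote parameter store
                self.intents = []       # (keys, begin, end) clock windows
                self.cache = {}         # worker-local copies
                self.clock = 0

            def intent(self, keys, begin, end):
                self.intents.append((set(keys), begin, end))

            def advance_clock(self):
                self.clock += 1
                self.cache = {k: self.store[k]
                              for keys, b, e in self.intents
                              if b <= self.clock <= e
                              for k in keys}

            def pull(self, key):
                # local hit if the parameter was prefetched, remote otherwise
                return self.cache.get(key, self.store[key])

        pm = IntentAwarePM(store={"w1": 0.1, "w2": 0.2})
        pm.intent(["w1"], begin=1, end=2)   # signal intent before the access
        pm.advance_clock()                  # the manager acts on open intents
        print(pm.pull("w1"))                # 0.1, served from the local cache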

    NuPS: A Parameter Server for Machine Learning with Non-Uniform Parameter Access

    Parameter servers (PSs) facilitate the implementation of distributed training for large machine learning tasks. In this paper, we argue that existing PSs are inefficient for tasks that exhibit non-uniform parameter access; their performance may even fall behind that of single node baselines. We identify two major sources of such non-uniform access: skew and sampling. Existing PSs are ill-suited for managing skew because they uniformly apply the same parameter management technique to all parameters. They are inefficient for sampling because the PS is oblivious to the associated randomized accesses and cannot exploit locality. To overcome these performance limitations, we introduce NuPS, a novel PS architecture that (i) integrates multiple management techniques and employs a suitable technique for each parameter and (ii) supports sampling directly via suitable sampling primitives and sampling schemes that allow for a controlled quality-efficiency trade-off. In our experimental study, NuPS outperformed existing PSs by up to one order of magnitude and provided up to linear scalability across multiple machine learning tasks.
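
    The sampling primitives can be pictured as below: instead of drawing random keys itself and pulling each one remotely, the task asks the PS to prepare a sample, which lets the PS bias drawing toward locally held parameters. The class, the method names, and the simple locality-biased scheme are illustrative assumptions, not the NuPS implementation.

        # A sampling scheme with a tunable knob: with probability p_local the
        # PS samples a locally held key (cheap), otherwise a remote one,
        # exposing a controlled quality/efficiency trade-off.

        import random

        class SamplingPS:
            def __init__(self, local_keys, remote_keys, p_local=0.8):
                self.local, self.remote = list(local_keys), list(remote_keys)
                self.p_local = p_local

            def prepare_sample(self, n):
                return [random.choice(self.local)
                        if random.random() < self.p_local
                        else random.choice(self.remote)
                        for _ in range(n)]

            def pull_sample(self, handle):
                # a real PS would resolve the prepared handle to values
                return handle

        random.seed(0)
        ps = SamplingPS(local_keys=["w0", "w1"], remote_keys=["w2", "w3"])
        print(ps.pull_sample(ps.prepare_sample(4)))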

    Discussing environmental education in everyday school life: project development in schools and initial and continuing teacher education

    This study discusses how Environmental Education (EE) has been approached in primary education and how teachers understand and incorporate EE into everyday school life, at a state school in the municipality of Tangará da Serra/MT, Brazil. To this end, interviews were conducted with the teachers who take part in an interdisciplinary EE project at the school under study. The study found that the school's project has not been achieving its stated objectives because of teachers' unfamiliarity with the project, deficient teacher training, a failure to understand EE as a teaching-learning process, a lack of didactic resources, and inadequate planning of activities. Based on these findings, the study discusses the impossibility of addressing the topic outside of interdisciplinary work and, above all, the importance of a deeper study of EE linking theory and practice, both in teacher education and in school projects, in order to move beyond the traditional association of "EE with ecology, garbage, and vegetable gardens".
